Some theory for Fisher’s linear discriminant function, ‘naive Bayes’, and some alternatives when there are many more variables than observations
Author
Abstract
We show that, in the classical problem of discriminating between two normal populations, the ‘naive Bayes’ classifier, which assumes independent covariates, greatly outperforms Fisher’s linear discriminant rule under broad conditions when the number of variables grows faster than the number of observations. We also introduce a class of rules spanning the range between independence and arbitrary dependence. These rules are shown to achieve Bayes consistency for the Gaussian ‘coloured noise’ model and to adapt to a spectrum of convergence rates, which we conjecture to be minimax.
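The contrast above can be seen in a small simulation. The sketch below is illustrative only and is not taken from the paper: it draws two Gaussian classes with identity covariance, far more variables than observations, and compares Fisher's rule (which needs the pooled covariance inverse; a pseudoinverse is used since the matrix is singular when p > n) against the independence rule that keeps only the diagonal. The dimensions, sample sizes, and mean separation are arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                     # far more variables than observations
mu = np.zeros(p)
mu[:10] = 1.0                      # the two class means differ in 10 coordinates

def simulate(n):
    X0 = rng.standard_normal((n, p))       # class 0: N(0, I)
    X1 = rng.standard_normal((n, p)) + mu  # class 1: N(mu, I)
    return X0, X1

X0, X1 = simulate(n)
m0, m1 = X0.mean(0), X1.mean(0)
Xc = np.vstack([X0 - m0, X1 - m1])
S = Xc.T @ Xc / (2 * n - 2)        # pooled covariance: singular, since p > n

# Fisher's rule: w = S^{-1}(m1 - m0); use a pseudoinverse since S is singular
w_fisher = np.linalg.pinv(S) @ (m1 - m0)
# Naive Bayes / independence rule: use only the diagonal of S
w_nb = (m1 - m0) / np.diag(S)

def err(w):
    """Average misclassification rate on fresh test data."""
    T0, T1 = simulate(2000)
    thresh = w @ (m0 + m1) / 2
    e0 = np.mean(T0 @ w > thresh)  # class 0 misclassified as class 1
    e1 = np.mean(T1 @ w < thresh)  # class 1 misclassified as class 0
    return (e0 + e1) / 2

print(f"Fisher (pinv): {err(w_fisher):.3f}   naive Bayes: {err(w_nb):.3f}")
```

With these settings the independence rule is close to the oracle error while the pseudoinverse Fisher rule degrades badly, matching the qualitative claim of the abstract.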
Similar resources
Supervised classification with conditional Gaussian networks: Increasing the structure complexity from naive Bayes
Most Bayesian network-based classifiers are able to handle only discrete variables. However, most real-world domains involve continuous variables. A common practice for dealing with continuous variables is to discretize them, with a subsequent loss of information. This work shows how discrete classifier induction algorithms can be adapted to the conditional Gaussian network paradigm ...
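The simplest conditional Gaussian network is Gaussian naive Bayes, which models each continuous feature with a per-class normal density and so avoids discretization entirely. A minimal sketch, using made-up toy data (the features, values, and class names are illustrative assumptions, not from the cited work):

```python
import math

# Toy training data: two continuous features per sample, two classes (hypothetical values).
data = {
    "A": [(1.0, 2.1), (0.8, 1.9), (1.2, 2.3)],
    "B": [(3.0, 0.5), (2.8, 0.4), (3.2, 0.7)],
}

def fit(samples):
    """Per-class mean and variance of each feature: the 'naive' conditional Gaussian model."""
    cols = list(zip(*samples))
    mean = [sum(c) / len(c) for c in cols]
    var = [sum((x - m) ** 2 for x in c) / (len(c) - 1) for c, m in zip(cols, mean)]
    return mean, var

params = {c: fit(s) for c, s in data.items()}

def log_gauss(x, m, v):
    """Log density of N(m, v) at x."""
    return -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)

def classify(x):
    """Pick the class maximizing the sum of per-feature Gaussian log-likelihoods."""
    def score(c):
        mean, var = params[c]
        return sum(log_gauss(xi, m, v) for xi, m, v in zip(x, mean, var))
    return max(params, key=score)

print(classify((1.1, 2.0)))  # a point near the class-A training samples
```

No discretization step is needed: the continuous densities are used directly in the classification score.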
Fisher’s Linear Discriminant Analysis for Weather Data by reproducing kernel Hilbert spaces framework
Recently, with the development of science and technology, data of a functional nature have become easy to collect. Hence, statistical analysis of such data is of great importance. As in multivariate analysis, linear combinations of random variables play a key role in functional analysis. The theory of Reproducing Kernel Hilbert Spaces is very important in this context. In this paper we study a gen...
Classic and Bayes Shrinkage Estimation in Rayleigh Distribution Using a Point Guess Based on Censored Data
Introduction: In classical methods of statistics, the parameter of interest is estimated from a random sample using natural estimators such as maximum likelihood or unbiased estimators (sample information). In practice, the researcher has prior information about the parameter in the form of a point guess value. The information in the guess value is called nonsample information. Thomp...
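The shrinkage idea described in the snippet is simply a convex combination of the point guess and the sample estimate. The sketch below illustrates it for the Rayleigh scale parameter with complete (uncensored) data; the censoring machinery of the cited work is not modeled, and the weight k is a fixed illustrative choice rather than the paper's data-driven one.

```python
import math
import random

random.seed(1)

def rayleigh_sample(sigma, n):
    """Draw n Rayleigh(sigma) variates by inverse-CDF sampling: X = sigma*sqrt(-2 ln U)."""
    return [sigma * math.sqrt(-2 * math.log(1.0 - random.random())) for _ in range(n)]

def mle_sigma2(xs):
    """MLE of sigma^2 for the Rayleigh distribution: sum(x^2) / (2n)."""
    return sum(x * x for x in xs) / (2 * len(xs))

def shrinkage_sigma2(xs, guess_sigma2, k=0.5):
    """Shrink the MLE toward a point guess: k*guess + (1-k)*MLE, for 0 <= k <= 1."""
    return k * guess_sigma2 + (1 - k) * mle_sigma2(xs)

xs = rayleigh_sample(2.0, 5000)
print(mle_sigma2(xs), shrinkage_sigma2(xs, guess_sigma2=4.0))
```

When the guess is accurate, the shrinkage estimator trades a little bias for reduced variance relative to the MLE alone.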
A New Classification Approach using Discriminant Functions
There are many algorithms for, and many applications of, classification and discrimination (grouping a set of objects into subsets of similar objects, where the objects in different subsets are dissimilar) in diverse fields [2-15, 23, 24], ranging from engineering to medicine to econometrics. Some examples are automatic target recognition (ATR) and fault and maintenance-time recognit...
Analysis of sequential physiology data with weighted naive Bayes
In this project, I describe how I addressed the ICML 2004 Physiological Data Modeling Contest. For the gender prediction task, I compressed the large entry-based dataset into a small session-based dataset and manually devised 90 features using a histogram method. Weighted naive Bayes (WNB), an extension of naive Bayes, was applied, and Markov Chain Monte Carlo was combined to solve the weight u...
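Weighted naive Bayes replaces the plain sum of per-feature log-likelihoods with a weighted sum, so uninformative features can be down-weighted or switched off. A minimal sketch with hypothetical histogram-style feature probabilities (the class names, feature values, and numbers below are invented for illustration and are not the contest's actual features):

```python
import math

# Hypothetical per-class, per-feature histogram probabilities: P(feature_i = value | class)
probs = {
    "male":   [{"low": 0.2, "high": 0.8}, {"low": 0.7, "high": 0.3}],
    "female": [{"low": 0.6, "high": 0.4}, {"low": 0.3, "high": 0.7}],
}
prior = {"male": 0.5, "female": 0.5}

def wnb_classify(x, weights):
    """Weighted naive Bayes: argmax_c log P(c) + sum_i w_i * log P(x_i | c)."""
    def score(c):
        return math.log(prior[c]) + sum(
            w * math.log(probs[c][i][v]) for i, (v, w) in enumerate(zip(x, weights))
        )
    return max(probs, key=score)

# Plain naive Bayes is the special case where every weight equals 1.
print(wnb_classify(("high", "low"), [1.0, 1.0]))
```

Setting a feature's weight to 0 removes its vote entirely, which is how a learned weight vector can effectively prune noisy features.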
Journal title:
Volume / Issue:
Pages: -
Publication date: 2004